Sensitive information detection method based on attention mechanism-based ELMo
Cheng HUANG, Qianrui ZHAO
Journal of Computer Applications    2022, 42 (7): 2009-2014.   DOI: 10.11772/j.issn.1001-9081.2021050877

To address the low accuracy and poor generalization of traditional sensitive information detection methods, such as the keyword character matching-based method and the phrase-level sentiment analysis-based method, a sensitive information detection method based on Attention mechanism-based Embedding from Language Model (A-ELMo) was proposed. Firstly, fast matching with a trie was performed to significantly reduce comparisons against irrelevant words, thereby greatly improving query efficiency. Secondly, an Embedding from Language Model (ELMo) was constructed for context analysis, and dynamic word vectors were used to fully represent contextual features and achieve high scalability. Finally, the attention mechanism was incorporated to strengthen the model's ability to identify sensitive features and further improve the detection rate of sensitive information. Experiments were carried out on real datasets composed of multiple network data sources. The results show that the accuracy of the proposed method is 13.3 percentage points higher than that of the phrase-level sentiment analysis-based method and 43.5 percentage points higher than that of the keyword matching-based method, verifying that the proposed method strengthens the identification of sensitive features and improves the detection rate of sensitive information.
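
The trie matching step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `Trie`, the method `find_all`, and the sample keywords are hypothetical, chosen only to show why a trie avoids comparing the text against every keyword separately.

```python
from typing import Dict, Iterable, List, Tuple


class TrieNode:
    """One node of the keyword trie (hypothetical helper, not from the paper)."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.is_end = False  # True if a keyword ends at this node


class Trie:
    """Character-level trie over a keyword list."""

    def __init__(self, words: Iterable[str]) -> None:
        self.root = TrieNode()
        for word in words:
            if word:  # skip empty strings so the root never marks a match
                self.insert(word)

    def insert(self, word: str) -> None:
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_end = True

    def find_all(self, text: str) -> List[Tuple[int, str]]:
        """Return (start_index, keyword) for every keyword occurrence in text."""
        hits: List[Tuple[int, str]] = []
        for start in range(len(text)):
            node = self.root
            for end in range(start, len(text)):
                node = node.children.get(text[end])
                if node is None:
                    break  # no keyword shares this prefix; stop early
                if node.is_end:
                    hits.append((start, text[start:end + 1]))
        return hits


# Assumed sample keywords, for illustration only.
trie = Trie(["bad", "badge", "ad"])
matches = trie.find_all("a badge")
```

Each starting position walks the trie only as far as some keyword prefix still matches, so words that share no prefix with any keyword are rejected after a single lookup. In a pipeline like the one the abstract outlines, such candidate hits would then be passed on to the contextual model rather than treated as final decisions.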

Text segmentation model based on graph convolutional network
Yuqi DU, Jin ZHENG, Yang WANG, Cheng HUANG, Ping LI
Journal of Computer Applications    2022, 42 (12): 3692-3699.   DOI: 10.11772/j.issn.1001-9081.2021101768

The main task of text segmentation is to divide a text into several relatively independent text blocks according to topic relevance. To address the shortcomings of existing text segmentation models in extracting fine-grained features such as paragraph structural information, semantic correlation and context interaction, a text segmentation model based on Graph Convolutional Network (GCN), named TS-GCN (Text Segmentation-Graph Convolutional Network), was proposed. Firstly, a text graph was constructed from the structural information and semantic logic of the text paragraphs. Then, semantic similarity attention was introduced to capture the fine-grained correlation between paragraph nodes, and GCN was used to pass information between high-order neighborhoods of paragraph nodes, which enhanced the model's ability to extract multi-granularity topic feature representations of paragraphs. The proposed model was compared with the representative model CATS (Coherence-Aware Text Segmentation) and its basic model TLT-TS (Two-Level Transformer model for Text Segmentation), both commonly used as benchmarks for the text segmentation task. Experimental results show that the evaluation index Pk of TS-GCN is 0.08 percentage points lower than that of TLT-TS without any auxiliary module on the Wikicities dataset. On the Wikielements dataset, the Pk value of the proposed model is 0.38 and 2.30 percentage points lower than those of CATS and TLT-TS, respectively. These results indicate that TS-GCN achieves a good segmentation effect.
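
For readers unfamiliar with the Pk index reported above, the following is a minimal sketch of the standard window-based definition, in which lower values indicate better segmentation. The function names `pk` and `count_segments`, the label encoding (one segment id per paragraph, with a new id for each new segment), and the default window size of half the mean reference segment length are assumptions for illustration; the paper's exact evaluation setup is not given in the abstract.

```python
from typing import Optional, Sequence


def count_segments(labels: Sequence[int]) -> int:
    """Number of contiguous segments in a sequence of segment ids."""
    return 1 + sum(1 for a, b in zip(labels, labels[1:]) if a != b)


def pk(ref: Sequence[int], hyp: Sequence[int], k: Optional[int] = None) -> float:
    """Fraction of windows where ref and hyp disagree on 'same segment or not'.

    ref and hyp hold one segment id per unit (e.g. per paragraph).
    Assumes ids are not reused across different segments.
    """
    n = len(ref)
    if n != len(hyp):
        raise ValueError("ref and hyp must have the same length")
    if k is None:
        # Common convention: half the mean reference segment length.
        k = max(1, round(n / (2 * count_segments(ref))))
    if not 0 < k < n:
        raise ValueError("window size k must satisfy 0 < k < len(ref)")

    errors = 0
    for i in range(n - k):
        same_in_ref = ref[i] == ref[i + k]
        same_in_hyp = hyp[i] == hyp[i + k]
        if same_in_ref != same_in_hyp:
            errors += 1
    return errors / (n - k)


# Toy example: the reference has two segments; the hypothesis misses the boundary.
reference = [0, 0, 0, 1, 1, 1]
no_boundary = [0, 0, 0, 0, 0, 0]
score_perfect = pk(reference, reference, k=2)
score_missed = pk(reference, no_boundary, k=2)
```

With a window of 2, the hypothesis that misses the single boundary disagrees with the reference in two of the four windows, so its score is 0.5, while a perfect segmentation scores 0.0.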
